Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 1315 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 123.4 KiB |
| Average record size in memory | 96.1 B |
Variable types
| NUM | 11 |
|---|---|
| CAT | 1 |
Warnings
| Message | Type |
|---|---|
| zipcode is highly correlated with schooldist and 1 other field | High correlation |
| schooldist is highly correlated with zipcode | High correlation |
| council is highly correlated with zipcode | High correlation |
| lotarea is highly skewed (γ1 = 27.08598631) | Skewed |
| df_index has unique values | Unique |
Reproduction
| Analysis started | 2021-05-29 22:19:56.939270 |
|---|---|
| Analysis finished | 2021-05-29 22:20:12.532016 |
| Duration | 15.59 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
df_index
Real number (ℝ≥0)
| Distinct | 1315 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 413750.4935 |
|---|---|
| Minimum | 7 |
| Maximum | 858669 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 10.3 KiB |
Quantile statistics
| Minimum | 7 |
|---|---|
| 5-th percentile | 16463.2 |
| Q1 | 164320.5 |
| median | 269590 |
| Q3 | 681588 |
| 95-th percentile | 797185.6 |
| Maximum | 858669 |
| Range | 858662 |
| Interquartile range (IQR) | 517267.5 |
Descriptive statistics
| Standard deviation | 280744.4121 |
|---|---|
| Coefficient of variation (CV) | 0.678535534 |
| Kurtosis | -1.637268931 |
| Mean | 413750.4935 |
| Median Absolute Deviation (MAD) | 260956 |
| Skewness | 0.1017818988 |
| Sum | 544081899 |
| Variance | 7.88174249e+10 |
| Monotonicity | Not monotonic |
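Several rows in the quantile and descriptive tables above are derived from others. The short sketch below recomputes them from the reported figures as a sanity check. It assumes that the observation count from the dataset statistics (1315) applies to this variable, which has no missing values; the variable names are illustrative only.

```python
# Figures copied from the report tables for this variable.
n = 1315                      # number of observations (dataset statistics)
total = 544081899             # Sum
minimum, maximum = 7, 858669  # Minimum, Maximum
q1, q3 = 164320.5, 681588     # Q1, Q3
std = 280744.4121             # Standard deviation

mean = total / n               # Mean = Sum / number of observations
value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                  # Interquartile range = Q3 - Q1
cv = std / mean                # Coefficient of variation = std / mean
variance = std ** 2            # Variance = std squared

print(f"mean={mean:.4f} range={value_range} IQR={iqr} "
      f"CV={cv:.9f} variance={variance:.6e}")
```

Each recomputed value agrees with the corresponding table row to the precision shown in the report.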
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 602111 | 1 | 0.1% | |
| 2875 | 1 | 0.1% | |
| 465596 | 1 | 0.1% | |
| 275137 | 1 | 0.1% | |
| 797378 | 1 | 0.1% | |
| 160451 | 1 | 0.1% | |
| 746561 | 1 | 0.1% | |
| 201416 | 1 | 0.1% | |
| 269001 | 1 | 0.1% | |
| 267963 | 1 | 0.1% | |
| Other values (1305) | 1305 | 99.2% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 7 | 1 | 0.1% | |
| 21 | 1 | 0.1% | |
| 32 | 1 | 0.1% | |
| 33 | 1 | 0.1% | |
| 394 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 858669 | 1 | 0.1% | |
| 855445 | 1 | 0.1% | |
| 853929 | 1 | 0.1% | |
| 853648 | 1 | 0.1% | |
| 853646 | 1 | 0.1% |
borough
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| MN | 1141 |
|---|---|
| BK | 95 |
| QN | 45 |
| BX | 33 |
| SI | 1 |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| MN | 1141 | 86.8% | |
| BK | 95 | 7.2% | |
| QN | 45 | 3.4% | |
| BX | 33 | 2.5% | |
| SI | 1 | 0.1% |
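The percentages in the frequency table follow directly from the counts. As a rough illustration, the sketch below rebuilds them from the five borough counts, along with the "Distinct" and "Unique" figures reported for this variable. The names `counts`, `freq`, `distinct` and `unique` are illustrative and are not part of the report.

```python
from collections import Counter

# Counts copied from the borough frequency table.
counts = Counter({"MN": 1141, "BK": 95, "QN": 45, "BX": 33, "SI": 1})
total = sum(counts.values())

# Frequency (%) = count / total observations, rounded to one decimal place.
freq = {value: f"{100 * count / total:.1f}%" for value, count in counts.items()}

# "Distinct" counts the different values; "Unique" counts values seen exactly once.
distinct = len(counts)
unique = sum(1 for count in counts.values() if count == 1)

for value, count in counts.most_common():
    print(f"| {value} | {count} | {freq[value]} |")
print(f"distinct={distinct} ({100 * distinct / total:.1f}%), "
      f"unique={unique} ({100 * unique / total:.1f}%)")
```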
Frequencies of value counts
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.1% |
Histogram of lengths of the category
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
block
Real number (ℝ≥0)
| Distinct | 735 |
|---|---|
| Distinct (%) | 55.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1198.855513 |
|---|---|
| Minimum | 4 |
| Maximum | 15638 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 10.3 KiB |
Quantile statistics
| Minimum | 4 |
|---|---|
| 5-th percentile | 26.7 |
| Q1 | 779 |
| median | 1104 |
| Q3 | 1363 |
| 95-th percentile | 2443 |
| Maximum | 15638 |
| Range | 15634 |
| Interquartile range (IQR) | 584 |
Descriptive statistics
| Standard deviation | 1198.05812 |
|---|---|
| Coefficient of variation (CV) | 0.9993348713 |
| Kurtosis | 42.90812459 |
| Mean | 1198.855513 |
| Median Absolute Deviation (MAD) | 298 |
| Skewness | 5.15274432 |
| Sum | 1576495 |
| Variance | 1435343.259 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 16 | 20 | 1.5% | |
| 1171 | 11 | 0.8% | |
| 763 | 10 | 0.8% | |
| 1118 | 9 | 0.7% | |
| 1269 | 8 | 0.6% | |
| 21 | 8 | 0.6% | |
| 993 | 7 | 0.5% | |
| 1295 | 7 | 0.5% | |
| 1158 | 7 | 0.5% | |
| 762 | 6 | 0.5% | |
| Other values (725) | 1222 | 92.9% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4 | 1 | 0.1% | |
| 5 | 1 | 0.1% | |
| 6 | 2 | 0.2% | |
| 8 | 1 | 0.1% | |
| 9 | 3 | 0.2% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 15638 | 1 | 0.1% | |
| 15610 | 1 | 0.1% | |
| 10101 | 1 | 0.1% | |
| 9998 | 1 | 0.1% | |
| 7459 | 1 | 0.1% |
schooldist
Real number (ℝ≥0)
| Distinct | 27 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.240304183 |
|---|---|
| Minimum | 1 |
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 10.3 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 2 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 20 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 6.030620993 |
|---|---|
| Coefficient of variation (CV) | 1.422214241 |
| Kurtosis | 8.788644772 |
| Mean | 4.240304183 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.052984874 |
| Sum | 5576 |
| Variance | 36.36838956 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=27)
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2 | 988 | 75.1% | |
| 3 | 98 | 7.5% | |
| 13 | 38 | 2.9% | |
| 30 | 30 | 2.3% | |
| 1 | 21 | 1.6% | |
| 14 | 17 | 1.3% | |
| 5 | 16 | 1.2% | |
| 15 | 14 | 1.1% | |
| 21 | 14 | 1.1% | |
| 6 | 10 | 0.8% | |
| Other values (17) | 69 | 5.2% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 21 | 1.6% | |
| 2 | 988 | 75.1% | |
| 3 | 98 | 7.5% | |
| 4 | 8 | 0.6% | |
| 5 | 16 | 1.2% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 31 | 1 | 0.1% | |
| 30 | 30 | 2.3% | |
| 28 | 9 | 0.7% | |
| 27 | 2 | 0.2% | |
| 25 | 3 | 0.2% |
council
Real number (ℝ≥0)
| Distinct | 39 |
|---|---|
| Distinct (%) | 3.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.051711027 |
|---|---|
| Minimum | 1 |
| Maximum | 50 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 10.3 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 4 |
| Q3 | 5 |
| 95-th percentile | 33 |
| Maximum | 50 |
| Range | 49 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 9.591047467 |
|---|---|
| Coefficient of variation (CV) | 1.36010217 |
| Kurtosis | 5.622520047 |
| Mean | 7.051711027 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 2.564630811 |
| Sum | 9273 |
| Variance | 91.98819151 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=39)
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4 | 443 | 33.7% | |
| 3 | 219 | 16.7% | |
| 1 | 178 | 13.5% | |
| 5 | 112 | 8.5% | |
| 6 | 83 | 6.3% | |
| 2 | 65 | 4.9% | |
| 33 | 51 | 3.9% | |
| 26 | 29 | 2.2% | |
| 35 | 17 | 1.3% | |
| 7 | 14 | 1.1% | |
| Other values (29) | 104 | 7.9% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 178 | 13.5% | |
| 2 | 65 | 4.9% | |
| 3 | 219 | 16.7% | |
| 4 | 443 | 33.7% | |
| 5 | 112 | 8.5% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 50 | 1 | 0.1% | |
| 48 | 9 | 0.7% | |
| 47 | 5 | 0.4% | |
| 43 | 1 | 0.1% | |
| 42 | 1 | 0.1% |
zipcode
Real number (ℝ≥0)
| Distinct | 100 |
|---|---|
| Distinct (%) | 7.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10167.04715 |
|---|---|
| Minimum | 10001 |
| Maximum | 11691 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 10.3 KiB |
Quantile statistics
| Minimum | 10001 |
|---|---|
| 5-th percentile | 10001 |
| Q1 | 10016 |
| median | 10022 |
| Q3 | 10038 |
| 95-th percentile | 11212 |
| Maximum | 11691 |
| Range | 1690 |
| Interquartile range (IQR) | 22 |
Descriptive statistics
| Standard deviation | 371.1828445 |
|---|---|
| Coefficient of variation (CV) | 0.03650842167 |
| Kurtosis | 4.173415413 |
| Mean | 10167.04715 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 2.417939814 |
| Sum | 13369667 |
| Variance | 137776.704 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 10022 | 108 | 8.2% | |
| 10017 | 95 | 7.2% | |
| 10019 | 90 | 6.8% | |
| 10016 | 79 | 6.0% | |
| 10018 | 78 | 5.9% | |
| 10036 | 71 | 5.4% | |
| 10001 | 69 | 5.2% | |
| 10023 | 58 | 4.4% | |
| 10128 | 39 | 3.0% | |
| 10028 | 38 | 2.9% | |
| Other values (90) | 590 | 44.9% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 10001 | 69 | 5.2% | |
| 10002 | 16 | 1.2% | |
| 10003 | 15 | 1.1% | |
| 10004 | 28 | 2.1% | |
| 10005 | 29 | 2.2% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 11691 | 2 | 0.2% | |
| 11435 | 1 | 0.1% | |
| 11433 | 1 | 0.1% | |
| 11415 | 1 | 0.1% | |
| 11379 | 1 | 0.1% |
landuse
Real number (ℝ≥0)
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.219011407 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 10.3 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 4 |
| median | 4 |
| Q3 | 5 |
| 95-th percentile | 5 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.9464041998 |
|---|---|
| Coefficient of variation (CV) | 0.2243189479 |
| Kurtosis | 3.027967883 |
| Mean | 4.219011407 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.9377365481 |
| Sum | 5548 |
| Variance | 0.8956809093 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=7)
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 5 | 494 | 37.6% | |
| 4 | 482 | 36.7% | |
| 3 | 309 | 23.5% | |
| 8 | 26 | 2.0% | |
| 6 | 2 | 0.2% | |
| 2 | 1 | 0.1% | |
| 1 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 1 | 0.1% | |
| 2 | 1 | 0.1% | |
| 3 | 309 | 23.5% | |
| 4 | 482 | 36.7% | |
| 5 | 494 | 37.6% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8 | 26 | 2.0% | |
| 6 | 2 | 0.2% | |
| 5 | 494 | 37.6% | |
| 4 | 482 | 36.7% | |
| 3 | 309 | 23.5% |
lotarea
Real number (ℝ≥0)
| Distinct | 1198 |
|---|---|
| Distinct (%) | 91.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 43980.41749 |
|---|---|
| Minimum | 1506 |
| Maximum | 5048550 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 10.3 KiB |
Quantile statistics
| Minimum | 1506 |
|---|---|
| 5-th percentile | 4895.3 |
| Q1 | 10847 |
| median | 21275 |
| Q3 | 41693 |
| 95-th percentile | 136977.2 |
| Maximum | 5048550 |
| Range | 5047044 |
| Interquartile range (IQR) | 30846 |
Descriptive statistics
| Standard deviation | 153041.8282 |
|---|---|
| Coefficient of variation (CV) | 3.479772066 |
| Kurtosis | 872.7974641 |
| Mean | 43980.41749 |
| Median Absolute Deviation (MAD) | 12534 |
| Skewness | 27.08598631 |
| Sum | 57834249 |
| Variance | 2.342180119e+10 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 9875 | 8 | 0.6% | |
| 7406 | 7 | 0.5% | |
| 10042 | 6 | 0.5% | |
| 7531 | 5 | 0.4% | |
| 6025 | 5 | 0.4% | |
| 12552 | 4 | 0.3% | |
| 5021 | 4 | 0.3% | |
| 24100 | 4 | 0.3% | |
| 4938 | 4 | 0.3% | |
| 7500 | 4 | 0.3% | |
| Other values (1188) | 1264 | 96.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1506 | 2 | 0.2% | |
| 1942 | 1 | 0.1% | |
| 2025 | 1 | 0.1% | |
| 2143 | 1 | 0.1% | |
| 2150 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 5048550 | 1 | 0.1% | |
| 833945 | 1 | 0.1% | |
| 746956 | 1 | 0.1% | |
| 659375 | 1 | 0.1% | |
| 622700 | 1 | 0.1% |
bldgarea
Real number (ℝ≥0)
| Distinct | 1298 |
|---|---|
| Distinct (%) | 98.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 460533.981 |
|---|---|
| Minimum | 1344 |
| Maximum | 13540113 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 10.3 KiB |
Quantile statistics
| Minimum | 1344 |
|---|---|
| 5-th percentile | 74242.1 |
| Q1 | 184080.5 |
| median | 323029 |
| Q3 | 541097.5 |
| 95-th percentile | 1263334.4 |
| Maximum | 13540113 |
| Range | 13538769 |
| Interquartile range (IQR) | 357017 |
Descriptive statistics
| Standard deviation | 601261.4827 |
|---|---|
| Coefficient of variation (CV) | 1.305574632 |
| Kurtosis | 199.8663707 |
| Mean | 460533.981 |
| Median Absolute Deviation (MAD) | 163517 |
| Skewness | 10.75492508 |
| Sum | 605602185 |
| Variance | 3.615153705e+11 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 623806 | 3 | 0.2% | |
| 332608 | 3 | 0.2% | |
| 225000 | 2 | 0.2% | |
| 431000 | 2 | 0.2% | |
| 50648 | 2 | 0.2% | |
| 177000 | 2 | 0.2% | |
| 224400 | 2 | 0.2% | |
| 272334 | 2 | 0.2% | |
| 96420 | 2 | 0.2% | |
| 216247 | 2 | 0.2% | |
| Other values (1288) | 1293 | 98.3% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1344 | 1 | 0.1% | |
| 3146 | 1 | 0.1% | |
| 3280 | 1 | 0.1% | |
| 12000 | 1 | 0.1% | |
| 23805 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 13540113 | 1 | 0.1% | |
| 8837500 | 1 | 0.1% | |
| 3693539 | 1 | 0.1% | |
| 3221237 | 1 | 0.1% | |
| 2907315 | 1 | 0.1% |
numfloors
Real number (ℝ≥0)
| Distinct | 62 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 32.33307985 |
|---|---|
| Minimum | 20.5 |
| Maximum | 104 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 10.3 KiB |
Quantile statistics
| Minimum | 20.5 |
|---|---|
| 5-th percentile | 21 |
| Q1 | 24 |
| median | 30 |
| Q3 | 38 |
| 95-th percentile | 54 |
| Maximum | 104 |
| Range | 83.5 |
| Interquartile range (IQR) | 14 |
Descriptive statistics
| Standard deviation | 11.36545646 |
|---|---|
| Coefficient of variation (CV) | 0.3515117184 |
| Kurtosis | 3.653381985 |
| Mean | 32.33307985 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 1.625893713 |
| Sum | 42518 |
| Variance | 129.1736005 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 21 | 150 | 11.4% | |
| 22 | 89 | 6.8% | |
| 23 | 78 | 5.9% | |
| 24 | 72 | 5.5% | |
| 25 | 70 | 5.3% | |
| 26 | 69 | 5.2% | |
| 32 | 55 | 4.2% | |
| 30 | 52 | 4.0% | |
| 31 | 47 | 3.6% | |
| 27 | 46 | 3.5% | |
| Other values (52) | 587 | 44.6% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 20.5 | 3 | 0.2% | |
| 21 | 150 | 11.4% | |
| 22 | 89 | 6.8% | |
| 22.5 | 1 | 0.1% | |
| 23 | 78 | 5.9% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 104 | 1 | 0.1% | |
| 90 | 1 | 0.1% | |
| 88 | 2 | 0.2% | |
| 78 | 1 | 0.1% | |
| 77 | 1 | 0.1% |
unitstotal
Real number (ℝ≥0)
| Distinct | 495 |
|---|---|
| Distinct (%) | 37.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 218.8486692 |
|---|---|
| Minimum | 0 |
| Maximum | 10948 |
| Zeros | 9 |
| Zeros (%) | 0.7% |
| Memory size | 10.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 36 |
| median | 139 |
| Q3 | 306 |
| 95-th percentile | 653.2 |
| Maximum | 10948 |
| Range | 10948 |
| Interquartile range (IQR) | 270 |
Descriptive statistics
| Standard deviation | 387.4176722 |
|---|---|
| Coefficient of variation (CV) | 1.770253727 |
| Kurtosis | 450.1455406 |
| Mean | 218.8486692 |
| Median Absolute Deviation (MAD) | 118 |
| Skewness | 16.94744441 |
| Sum | 287786 |
| Variance | 150092.4527 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 111 | 8.4% | |
| 2 | 46 | 3.5% | |
| 3 | 16 | 1.2% | |
| 184 | 13 | 1.0% | |
| 4 | 13 | 1.0% | |
| 40 | 10 | 0.8% | |
| 17 | 9 | 0.7% | |
| 29 | 9 | 0.7% | |
| 0 | 9 | 0.7% | |
| 65 | 8 | 0.6% | |
| Other values (485) | 1071 | 81.4% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 9 | 0.7% | |
| 1 | 111 | 8.4% | |
| 2 | 46 | 3.5% | |
| 3 | 16 | 1.2% | |
| 4 | 13 | 1.0% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 10948 | 1 | 0.1% | |
| 3027 | 1 | 0.1% | |
| 1706 | 1 | 0.1% | |
| 1615 | 1 | 0.1% | |
| 1604 | 1 | 0.1% |
yearbuilt
Real number (ℝ≥0)
| Distinct | 119 |
|---|---|
| Distinct (%) | 9.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1972.610646 |
|---|---|
| Minimum | 0 |
| Maximum | 2020 |
| Zeros | 3 |
| Zeros (%) | 0.2% |
| Memory size | 10.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1920.7 |
| Q1 | 1962 |
| median | 1980 |
| Q3 | 2006 |
| 95-th percentile | 2018 |
| Maximum | 2020 |
| Range | 2020 |
| Interquartile range (IQR) | 44 |
Descriptive statistics
| Standard deviation | 99.5519592 |
|---|---|
| Coefficient of variation (CV) | 0.05046711037 |
| Kurtosis | 350.5855389 |
| Mean | 1972.610646 |
| Median Absolute Deviation (MAD) | 23 |
| Skewness | -17.79287381 |
| Sum | 2593983 |
| Variance | 9910.592581 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2015 | 41 | 3.1% | |
| 1963 | 38 | 2.9% | |
| 2018 | 34 | 2.6% | |
| 1930 | 33 | 2.5% | |
| 1987 | 33 | 2.5% | |
| 1964 | 32 | 2.4% | |
| 2006 | 31 | 2.4% | |
| 2016 | 30 | 2.3% | |
| 1929 | 28 | 2.1% | |
| 1986 | 28 | 2.1% | |
| Other values (109) | 987 | 75.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 3 | 0.2% | |
| 1883 | 1 | 0.1% | |
| 1885 | 1 | 0.1% | |
| 1895 | 1 | 0.1% | |
| 1896 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2020 | 22 | 1.7% | |
| 2019 | 27 | 2.1% | |
| 2018 | 34 | 2.6% | |
| 2017 | 25 | 1.9% | |
| 2016 | 30 | 2.3% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r. To calculate r for two variables X and Y, divide the covariance of X and Y by the product of their standard deviations.
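As a minimal sketch of that calculation, the plain-Python function below divides the covariance by the product of the standard deviations. It is not the routine the report itself uses. The sample values are `lotarea` and `bldgarea` from the first ten rows shown at the end of this report, so the result describes only that small sample. The helper name `pearson_r` is illustrative.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products are enough.
    cov = sum(a * b for a, b in zip(dx, dy))
    sx = math.sqrt(sum(a * a for a in dx))
    sy = math.sqrt(sum(b * b for b in dy))
    if sx == 0 or sy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (sx * sy)

# lotarea and bldgarea for the first ten rows of the sample table.
lotarea = [107766, 17573, 81325, 257500, 153080, 15445, 22102, 22219, 7537, 16820]
bldgarea = [2117061, 629323, 1526121, 751412, 666393, 336025, 302439, 143052, 109822, 471985]

r = pearson_r(lotarea, bldgarea)
# Shifting and rescaling one variable (by a positive factor) leaves r unchanged.
r_scaled = pearson_r([2 * x + 5 for x in lotarea], bldgarea)
print(round(r, 4), round(r_scaled, 4))
```

The two printed values agree, which illustrates the invariance under changes of location and scale.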
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, divide the covariance of the rank variables of X and Y by the product of their standard deviations.
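The rank-based definition can be sketched in a few lines: rank both variables, then apply Pearson's formula to the ranks. The version below assumes tied values receive the average of their positions, which is a common convention but an assumption here. The helper names are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sx * sy)

def spearman_rho(xs, ys):
    """Pearson's r computed on the ranks of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but nonlinear relationship: y = x cubed.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print("Pearson :", round(pearson_r(xs, ys), 4))
print("Spearman:", round(spearman_rho(xs, ys), 4))
```

For this cubic relationship, ρ reaches 1 (up to floating-point rounding) while r stays below 1, which is the difference the paragraph above describes.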
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures the ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, count the concordant and discordant pairs of observations; τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
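The pair-counting definition above corresponds to the variant often called τ-a. The sketch below implements exactly that definition. Statistical libraries commonly report a tie-adjusted variant instead, so values on data with many ties may differ from what a profiling tool shows. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in either variable count as neither.
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs

# Six pairs in total: five concordant and one discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(round(tau, 4))  # 0.6667
```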
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reduces to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available for it.
First rows
| | df_index | borough | block | schooldist | council | zipcode | landuse | lotarea | bldgarea | numfloors | unitstotal | yearbuilt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 658320 | MN | 1265 | 2.0 | 4.0 | 10020.0 | 5.0 | 107766.0 | 2117061.0 | 70.0 | 109.0 | 1937.0 |
| 1 | 201620 | MN | 1299 | 2.0 | 4.0 | 10017.0 | 5.0 | 17573.0 | 629323.0 | 43.0 | 17.0 | 1982.0 |
| 2 | 156742 | MN | 1308 | 2.0 | 4.0 | 10022.0 | 5.0 | 81325.0 | 1526121.0 | 39.0 | 2.0 | 1969.0 |
| 3 | 346605 | BK | 3077 | 14.0 | 34.0 | 11206.0 | 3.0 | 257500.0 | 751412.0 | 21.0 | 772.0 | 1965.0 |
| 4 | 270630 | MN | 1536 | 2.0 | 5.0 | 10128.0 | 3.0 | 153080.0 | 666393.0 | 42.0 | 648.0 | 1975.0 |
| 5 | 630292 | MN | 10 | 2.0 | 1.0 | 10004.0 | 5.0 | 15445.0 | 336025.0 | 24.0 | 97.0 | 1930.0 |
| 6 | 128222 | MN | 1505 | 2.0 | 4.0 | 10128.0 | 4.0 | 22102.0 | 302439.0 | 32.0 | 212.0 | 1984.0 |
| 7 | 164312 | MN | 699 | 2.0 | 3.0 | 10001.0 | 4.0 | 22219.0 | 143052.0 | 25.0 | 41.0 | 2014.0 |
| 8 | 269434 | MN | 1318 | 2.0 | 4.0 | 10017.0 | 4.0 | 7537.0 | 109822.0 | 35.0 | 8.0 | 2015.0 |
| 9 | 200453 | MN | 997 | 2.0 | 4.0 | 10036.0 | 5.0 | 16820.0 | 471985.0 | 48.0 | 1.0 | 1988.0 |
Last rows
| | df_index | borough | block | schooldist | council | zipcode | landuse | lotarea | bldgarea | numfloors | unitstotal | yearbuilt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1305 | 642313 | MN | 1485 | 2.0 | 5.0 | 10021.0 | 8.0 | 39547.0 | 757439.0 | 24.0 | 1.0 | 2015.0 |
| 1306 | 200583 | MN | 1024 | 2.0 | 3.0 | 10019.0 | 5.0 | 23900.0 | 762619.0 | 35.0 | 1.0 | 1987.0 |
| 1307 | 153741 | MN | 1314 | 2.0 | 4.0 | 10016.0 | 8.0 | 19701.0 | 279254.0 | 25.0 | 24.0 | 2001.0 |
| 1308 | 608074 | BX | 2623 | 7.0 | 17.0 | 10455.0 | 3.0 | 166139.0 | 422400.0 | 22.0 | 471.0 | 1960.0 |
| 1309 | 746569 | MN | 1280 | 2.0 | 4.0 | 10017.0 | 5.0 | 57282.0 | 1028194.0 | 26.0 | 13.0 | 1919.0 |
| 1310 | 268057 | MN | 840 | 2.0 | 4.0 | 10018.0 | 5.0 | 4148.0 | 88551.0 | 34.0 | 173.0 | 2018.0 |
| 1311 | 274079 | MN | 2170 | 6.0 | 10.0 | 10040.0 | 3.0 | 96675.0 | 223200.0 | 21.0 | 205.0 | 1959.0 |
| 1312 | 680761 | MN | 861 | 2.0 | 4.0 | 10016.0 | 4.0 | 8400.0 | 175687.0 | 35.0 | 166.0 | 2008.0 |
| 1313 | 200392 | MN | 811 | 2.0 | 3.0 | 10018.0 | 5.0 | 19750.0 | 408511.0 | 22.0 | 88.0 | 1925.0 |
| 1314 | 227378 | MN | 1037 | 2.0 | 3.0 | 10036.0 | 5.0 | 3292.0 | 75902.0 | 29.0 | 1.0 | 2014.0 |